Mediterranean Sea
MUSE: A Simple Yet Effective Multimodal Search-Based Framework for Lifelong User Interest Modeling
Wu, Bin, Yang, Feifan, Chan, Zhangming, Gu, Yu-Ran, Feng, Jiawei, Yi, Chao, Sheng, Xiang-Rong, Zhu, Han, Xu, Jian, Ye, Mang, Zheng, Bo
Lifelong user interest modeling is crucial for industrial recommender systems, yet existing approaches rely predominantly on ID-based features, suffering from poor generalization on long-tail items and limited semantic expressiveness. While recent work explores multimodal representations for behavior retrieval in the General Search Unit (GSU), it often neglects multimodal integration in the fine-grained modeling stage -- the Exact Search Unit (ESU). In this work, we present a systematic analysis of how to effectively leverage multimodal signals across both stages of the two-stage lifelong modeling framework. Our key insight is that simplicity suffices in the GSU: lightweight cosine similarity with high-quality multimodal embeddings outperforms complex retrieval mechanisms. In contrast, the ESU demands richer multimodal sequence modeling and effective ID-multimodal fusion to unlock its full potential. Guided by these principles, we propose MUSE, a simple yet effective multimodal search-based framework. MUSE has been deployed in the Taobao display advertising system, enabling 100K-length user behavior sequence modeling and delivering significant gains in top-line metrics with negligible online latency overhead. To foster community research, we share industrial deployment practices and open-source the first large-scale dataset featuring ultra-long behavior sequences paired with high-quality multimodal embeddings. Our code and data are available at https://taobao-mm.github.io.
- Asia > China > Beijing > Beijing (0.05)
- Asia > China > Hubei Province > Wuhan (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Middle East > Israel > Mediterranean Sea (0.04)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.61)
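The GSU idea in the MUSE abstract above (retrieve the most relevant past behaviors by cosine similarity between embeddings) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy two-dimensional embeddings, and the tie-breaking rule are assumptions made for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve_top_k(target_emb, behavior_embs, k):
    """Return indices of the k behaviors most similar to the target, best first.

    Ties are broken by the lower index (an arbitrary choice for this sketch).
    """
    scored = [(cosine(target_emb, emb), idx) for idx, emb in enumerate(behavior_embs)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]


# Hypothetical toy data: four past behaviors and one candidate item.
history = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
target = [1.0, 0.0]
top2 = retrieve_top_k(target, history, k=2)
print(top2)
```

In a production setting the linear scan would presumably be replaced by a batched or approximate nearest-neighbor search; the sketch only shows the scoring rule.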
"As Eastern Powers, I will veto." : An Investigation of Nation-level Bias of Large Language Models in International Relations
Choi, Jonghyeon, Choi, Yeonjun, Kim, Hyun-chul, Jang, Beakcheol
This paper systematically examines nation-level biases exhibited by Large Language Models (LLMs) within the domain of International Relations (IR). Leveraging historical records from the United Nations Security Council (UNSC), we developed a bias evaluation framework comprising three distinct tests to explore nation-level bias in various LLMs, with a particular focus on the five permanent members of the UNSC. Experimental results show that, although general bias patterns appear across models (e.g., favorable biases toward the western nations, and unfavorable biases toward Russia), they still vary from one LLM to another. Notably, even within the same LLM, the direction and magnitude of bias for a nation change depending on the evaluation context. This observation suggests that LLM biases are fundamentally multidimensional, varying across models and tasks. We also observe that models with stronger reasoning abilities show reduced bias and better performance. Building on this finding, we introduce a debiasing framework that improves LLMs' factual reasoning by combining Retrieval-Augmented Generation with Reflexion-based self-reflection techniques. Experiments show it effectively reduces nation-level bias and improves performance, particularly in GPT-4o-mini and Llama-3.3-70B. Our findings emphasize the need to assess nation-level bias alongside performance when applying LLMs in the IR domain.
- Europe > Russia (0.39)
- Asia > Russia (0.39)
- North America > United States (0.15)
- Law > International Law (1.00)
- Government > Military (1.00)
- Government > Foreign Policy (1.00)
Empowering LLM Agents with Geospatial Awareness: Toward Grounded Reasoning for Wildfire Response
Chen, Yiheng, Li, Lingyao, Ma, Zihui, Hu, Qikai, Zhu, Yilun, Deng, Min, Yu, Runlong
Effective disaster response is essential for safeguarding lives and property. Existing statistical approaches often lack semantic context, generalize poorly across events, and offer limited interpretability. While large language models (LLMs) provide few-shot generalization, they remain text-bound and blind to geography. To bridge this gap, we introduce a Geospatial Awareness Layer (GAL) that grounds LLM agents in structured earth data. Starting from raw wildfire detections, GAL automatically retrieves and integrates infrastructure, demographic, terrain, and weather information from external geodatabases, assembling them into a concise, unit-annotated perception script. This enriched context enables agents to produce evidence-based resource-allocation recommendations (e.g., personnel assignments, budget allocations), further reinforced by historical analogs and daily change signals for incremental updates. We evaluate the framework in real wildfire scenarios across multiple LLMs, showing that geospatially grounded agents can outperform baselines. The proposed framework can generalize to other hazards such as floods and hurricanes.
- Europe > Austria > Vienna (0.14)
- North America > United States > California (0.05)
- Asia > Middle East > Jordan (0.04)
Securing AI Agents with Information-Flow Control
Costa, Manuel, Köpf, Boris, Kolluri, Aashish, Paverd, Andrew, Russinovich, Mark, Salem, Ahmed, Tople, Shruti, Wutschitz, Lukas, Zanella-Béguelin, Santiago
As AI agents become increasingly autonomous and capable, ensuring their security against vulnerabilities such as prompt injection becomes critical. This paper explores the use of information-flow control (IFC) to provide security guarantees for AI agents. We present a formal model to reason about the security and expressiveness of agent planners. Using this model, we characterize the class of properties enforceable by dynamic taint-tracking and construct a taxonomy of tasks to evaluate security and utility trade-offs of planner designs. Informed by this exploration, we present Fides, a planner that tracks confidentiality and integrity labels, deterministically enforces security policies, and introduces novel primitives for selectively hiding information. Its evaluation in AgentDojo demonstrates that this approach enables us to complete a broad range of tasks with security guarantees. A tutorial to walk readers through the concepts introduced in the paper can be found at https://github.com/microsoft/fides
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Asia > Middle East > Palestine > Gaza Strip > Rafah Governorate > Rafah (0.04)
- Asia > Middle East > Israel > Mediterranean Sea (0.04)
- Africa > Cameroon > Gulf of Guinea (0.04)
- Research Report (0.63)
- Overview (0.45)
- Instructional Material > Course Syllabus & Notes (0.34)
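The dynamic taint-tracking idea mentioned in the Fides abstract above can be sketched as follows: every value carries an integrity and a confidentiality label, labels are joined whenever values are combined, and a policy check runs before a tool call. This is only an illustration under stated assumptions. The two-level labels, the class and function names, and the policy rule are all hypothetical and are not taken from Fides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    untrusted: bool  # integrity: True if the value was influenced by untrusted input
    secret: bool     # confidentiality: True if the value was derived from secret data

    def join(self, other):
        """Least upper bound: the result is as tainted as the most tainted input."""
        return Label(self.untrusted or other.untrusted, self.secret or other.secret)


@dataclass(frozen=True)
class Tainted:
    value: str
    label: Label


def concat(a, b):
    """Combine two labeled strings; the result inherits the join of both labels."""
    return Tainted(a.value + b.value, a.label.join(b.label))


def may_call_tool(arg, requires_trusted, sink_is_public):
    """Hypothetical policy: block untrusted data at consequential tools,
    and block secret data at public sinks."""
    if requires_trusted and arg.label.untrusted:
        return False
    if sink_is_public and arg.label.secret:
        return False
    return True


# Hypothetical scenario: a trusted user request is combined with text fetched from the web.
user_query = Tainted("send report to ", Label(untrusted=False, secret=False))
web_text = Tainted("attacker@example.com", Label(untrusted=True, secret=False))
combined = concat(user_query, web_text)
print(may_call_tool(combined, requires_trusted=True, sink_is_public=False))
```

Because the check depends only on labels and not on model output, the decision is deterministic, which is the property the abstract emphasizes.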
Progent: Programmable Privilege Control for LLM Agents
Shi, Tianneng, He, Jingxuan, Wang, Zhun, Li, Hongwei, Wu, Linyu, Guo, Wenbo, Song, Dawn
LLM agents utilize Large Language Models as central components with diverse tools to complete various user tasks, but face significant security risks when interacting with external environments. Attackers can exploit these agents through various vectors, including indirect prompt injection, memory/knowledge base poisoning, and malicious tools, tricking agents into performing dangerous actions such as unauthorized financial transactions or data leakage. The core problem that enables attacks to succeed lies in over-privileged tool access. We introduce Progent, the first privilege control framework to secure LLM agents. Progent enforces security at the tool level by restricting agents to performing tool calls necessary for user tasks while blocking potentially malicious ones. Progent features a domain-specific language that allows for expressing fine-grained policies for controlling tool privileges, flexible fallback actions when calls are blocked, and dynamic policy updates to adapt to changing agent states. The framework operates deterministically at runtime, providing provable security guarantees. Thanks to our modular design, integrating Progent does not alter agent internals and only requires minimal changes to the existing agent implementation, enhancing its practicality and potential for widespread adoption. Our extensive evaluation across various agent use cases, using benchmarks like AgentDojo, ASB, and AgentPoison, demonstrates that Progent reduces attack success rates to 0%, while preserving agent utility and speed. Additionally, we show that LLMs can automatically generate effective policies, highlighting their potential for automating the process of writing Progent's security policies.
- North America > United States (0.14)
- Asia > Middle East > Israel > Mediterranean Sea (0.04)
- Asia > Singapore (0.04)
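The tool-level privilege control described in the Progent abstract above can be pictured with a small sketch: each tool call is checked against a per-tool rule before execution, unknown tools are denied by default, and a blocked call returns a fallback message instead of running. This does not use Progent's domain-specific language. The policy format, tool names, and limits below are assumptions invented for illustration.

```python
def make_policy():
    """Hypothetical policy: map each permitted tool to a predicate over its arguments."""
    return {
        "read_file": lambda args: args.get("path", "").startswith("/workspace/"),
        "send_money": lambda args: (
            args.get("amount", 0) <= 100 and args.get("recipient") in {"alice"}
        ),
    }


def check_call(policy, tool, args):
    """Return ("allow", "") or ("block", reason). Tools without a rule are denied."""
    rule = policy.get(tool)
    if rule is None:
        return ("block", f"tool '{tool}' is not allowed")
    if rule(args):
        return ("allow", "")
    return ("block", f"arguments to '{tool}' violate policy")


policy = make_policy()
print(check_call(policy, "read_file", {"path": "/workspace/notes.txt"}))
print(check_call(policy, "send_money", {"amount": 5000, "recipient": "mallory"}))
print(check_call(policy, "delete_account", {}))
```

The check wraps the tool call rather than the model, so it could be added without changing agent internals, in line with the modular design the abstract describes.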
MetaScenes: Towards Automated Replica Creation for Real-world 3D Scans
Yu, Huangyue, Jia, Baoxiong, Chen, Yixin, Yang, Yandan, Li, Puhao, Su, Rongpeng, Li, Jiaxin, Li, Qing, Liang, Wei, Zhu, Song-Chun, Liu, Tengyu, Huang, Siyuan
Embodied AI (EAI) research requires high-quality, diverse 3D scenes to effectively support skill acquisition, sim-to-real transfer, and generalization. Achieving these quality standards, however, necessitates the precise replication of real-world object diversity. Existing datasets demonstrate that this process heavily relies on artist-driven designs, which demand substantial human effort and present significant scalability challenges. To scalably produce realistic and interactive 3D scenes, we first present MetaScenes, a large-scale, simulatable 3D scene dataset constructed from real-world scans, which includes 15366 objects spanning 831 fine-grained categories. Then, we introduce Scan2Sim, a robust multi-modal alignment model, which enables the automated, high-quality replacement of assets, thereby eliminating the reliance on artist-driven designs for scaling 3D scenes. We further propose two benchmarks to evaluate MetaScenes: a detailed scene synthesis task focused on small item layouts for robotic manipulation and a domain transfer task in vision-and-language navigation (VLN) to validate cross-domain transfer. Results confirm MetaScenes' potential to enhance EAI by supporting more generalizable agent learning and sim-to-real applications, introducing new possibilities for EAI research. Project website: https://meta-scenes.github.io/.
Pre-training Graph Neural Networks with Structural Fingerprints for Materials Discovery
Jia, Shuyi, Govil, Shitij, Ramprasad, Manav, Fung, Victor
In recent years, pre-trained graph neural networks (GNNs) have been developed as general models which can be effectively fine-tuned for various potential downstream tasks in materials science, and have shown significant improvements in accuracy and data efficiency. The most widely used pre-training methods currently involve either supervised training to fit a general force field or self-supervised training by denoising equilibrium atomic structures. Both methods require datasets generated from quantum mechanical calculations, which quickly become intractable when scaling to larger datasets. Here we propose a novel pre-training objective which instead uses cheaply-computed structural fingerprints as targets while maintaining comparable performance across a range of different structural descriptors. Our experiments show this approach can act as a general strategy for pre-training GNNs with application towards large scale foundational models for atomistic data.
- North America > United States (0.46)
- Asia > Middle East > Israel > Mediterranean Sea (0.24)
Neural Network Modeling of Microstructure Complexity Using Digital Libraries
Microstructure evolution in matter is often modeled numerically using field or level-set solvers, mirroring the dual representation of spatiotemporal complexity in terms of pixel or voxel data, and geometrical forms in vector graphics. Motivated by this analogy, as well as the structural and event-driven nature of artificial and spiking neural networks, respectively, we evaluate their performance in learning and predicting fatigue crack growth and Turing pattern development. Predictions are made based on digital libraries constructed from computer simulations, which can be replaced by experimental data to lift the mathematical overconstraints of physics. Our assessment suggests that the leaky integrate-and-fire neuron model offers superior predictive accuracy with fewer parameters and less memory usage, alleviating the accuracy-cost tradeoff in contrast to the common practices in computer vision tasks. Examination of network architectures shows that these benefits arise from its reduced weight range and sparser connections. The study highlights the capability of event-driven models in tackling problems with evolutionary bulk-phase and interface behaviors using the digital library approach.
- North America > United States (0.28)
- Asia > Middle East > Israel > Mediterranean Sea (0.24)
- Energy > Oil & Gas (0.47)
- Health & Medicine (0.46)
- Aerospace & Defense (0.46)
Multi-field Visualization: Trait design and trait-induced merge trees
Lei, Danhua, Jankowai, Jochen, Hristov, Petar, Carr, Hamish, Denby, Leif, Masood, Talha Bin, Hotz, Ingrid
Feature level sets (FLS) have shown significant potential in the analysis of multi-field data by using traits defined in attribute space to specify features in the domain. In this work, we address key challenges in the practical use of FLS: trait design and feature selection for rendering. To simplify trait design, we propose a Cartesian decomposition of traits into simpler components, making the process more intuitive and computationally efficient. Additionally, we utilize dictionary learning results to automatically suggest point traits. To enhance feature selection, we introduce trait-induced merge trees (TIMTs), a generalization of merge trees for feature level sets, aimed at topologically analyzing tensor fields or general multi-variate data. The leaves in the TIMT represent areas in the input data that are closest to the defined trait, thereby most closely resembling the defined feature. This merge tree provides a hierarchy of features, enabling the querying of the most relevant and persistent features. Our method includes various query techniques for the tree, allowing the highlighting of different aspects. We demonstrate the cross-application capabilities of this approach through five case studies from different domains.
- North America > United States (0.28)
- Europe > Germany (0.28)
- North America > Canada (0.28)
- Asia > Middle East > Israel > Mediterranean Sea (0.24)
An Automatic Graph Construction Framework based on Large Language Models for Recommendation
Shan, Rong, Lin, Jianghao, Zhu, Chenxu, Chen, Bo, Zhu, Menghui, Zhang, Kangning, Zhu, Jieming, Tang, Ruiming, Yu, Yong, Zhang, Weinan
Graph neural networks (GNNs) have emerged as state-of-the-art methods to learn from graph-structured data for recommendation. However, most existing GNN-based recommendation methods focus on the optimization of model structures and learning strategies based on pre-defined graphs, neglecting the importance of the graph construction stage. Earlier works for graph construction usually rely on specific rules or crowdsourcing, which are either too simplistic or too labor-intensive. Recent works have started to utilize large language models (LLMs) to automate graph construction, in view of their abundant open-world knowledge and remarkable reasoning capabilities. Nevertheless, they generally suffer from two limitations: (1) invisibility of global view (e.g., overlooking contextual information) and (2) construction inefficiency. To this end, we introduce AutoGraph, an automatic graph construction framework based on LLMs for recommendation. Specifically, we first use LLMs to infer the user preference and item knowledge, which is encoded as semantic vectors. Next, we employ vector quantization to extract the latent factors from the semantic vectors. The latent factors are then incorporated as extra nodes to link the user/item nodes, resulting in a graph with in-depth global-view semantics. We further design metapath-based message aggregation to effectively aggregate the semantic and collaborative information. The framework is model-agnostic and compatible with different backbone models. Extensive experiments on three real-world datasets demonstrate the efficacy and efficiency of AutoGraph compared to existing baseline methods. We have deployed AutoGraph on the Huawei advertising platform and gained a 2.69% improvement on RPM and a 7.31% improvement on eCPM in the online A/B test. Currently AutoGraph is used as the main traffic model, serving hundreds of millions of people.
- Asia > China > Shanghai > Shanghai (0.05)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Leisure & Entertainment (1.00)
- Information Technology > Services (0.46)
- Media > Music (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
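The step in the AutoGraph abstract above where vector quantization turns semantic vectors into latent-factor nodes can be sketched as nearest-codeword assignment followed by edge creation. This is a simplified illustration, not the paper's method: the codebook here is fixed rather than learned, and the node names, vectors, and `factor_` naming scheme are hypothetical.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def quantize(vector, codebook):
    """Return the index of the codeword closest to the vector."""
    return min(range(len(codebook)), key=lambda i: squared_distance(vector, codebook[i]))


def build_factor_edges(node_vectors, codebook):
    """Link each user/item node to the latent-factor node of its nearest codeword."""
    return [
        (node, f"factor_{quantize(vec, codebook)}")
        for node, vec in node_vectors.items()
    ]


# Hypothetical toy data: a two-entry codebook and three semantic vectors.
codebook = [[0.0, 0.0], [1.0, 1.0]]
nodes = {"user_1": [0.1, 0.2], "item_7": [0.9, 0.8], "item_9": [0.2, 0.0]}
edges = build_factor_edges(nodes, codebook)
print(edges)
```

Nodes that share a factor (here `user_1` and `item_9`) become connected through it, which is one way to read the "global-view" links the abstract refers to.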